A Scalable Approach for Privacy-Preserving Collaborative Machine Learning
We consider a collaborative learning scenario in which multiple data owners wish to jointly train a logistic regression model while keeping their individual datasets private from the other parties. We propose COPML, a fully decentralized training framework that achieves scalability and privacy protection simultaneously. The key idea of COPML is to securely encode the individual datasets so as to distribute the computation load effectively across many parties, and to perform the training computations and model updates in a distributed manner on the securely encoded data. We provide a privacy analysis of COPML and prove its convergence. Furthermore, we experimentally demonstrate that COPML achieves significant speedup in training over benchmark protocols. Our protocol provides strong statistical privacy guarantees against colluding parties (adversaries) with unbounded computational power, while achieving up to $16\times$ speedup in training time over the benchmark protocols.
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.40)
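The abstract's key idea is to securely encode each party's dataset so that computation can be distributed without revealing the data. The abstract does not spell out the encoding itself, so as a minimal illustration of the underlying principle, here is a Shamir-style secret-sharing sketch over a prime field; the function names, the field size, and the use of Shamir sharing are our own assumptions for illustration, not COPML's actual encoding:

```python
import random

PRIME = 2**61 - 1  # a Mersenne prime, chosen here purely for illustration

def share(secret, n_parties, threshold):
    """Split `secret` into n_parties shares using a random degree-`threshold`
    polynomial with f(0) = secret. Any threshold+1 shares reconstruct the
    secret; threshold or fewer shares reveal nothing about it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold)]
    return [(x, sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n_parties + 1)]

def reconstruct(shares):
    """Recover f(0) by Lagrange interpolation at x = 0 over the prime field."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # pow(den, PRIME - 2, PRIME) is the modular inverse of den (Fermat)
        total = (total + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return total
```

For example, `reconstruct(share(1234, 5, 2)[:3])` returns `1234` from any 3 of the 5 shares. Because such shares are linear in the secret, parties can sum or linearly combine shares locally, which is what makes distributed computation on encoded data possible.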
Review for NeurIPS paper: A Scalable Approach for Privacy-Preserving Collaborative Machine Learning
The initial reviews showed some disagreement about this paper: two positive reviewers noted the reduction in computational and communication costs compared to prior solutions, while two more negative reviewers raised concerns, in particular regarding novelty and the comparison with previous work. After reading the author rebuttal and further discussion, the doubts regarding the comparison to recent work were resolved, leading one reviewer to increase their score. While some concerns remain regarding the applicability of the work to non-linear models, the merits of the work are judged significant enough, and we decided the paper should be accepted. In the final version, the authors are asked to be more explicit about the potential limitations of the degree-1 approximation to the sigmoid, and to add a discussion of how one might extend the approach to more complicated (deep) models.
- Information Technology > Artificial Intelligence > Machine Learning (0.76)
- Information Technology > Data Science > Data Mining > Big Data (0.40)
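The meta-review asks the authors to discuss the limitations of the degree-1 approximation to the sigmoid. To make that limitation concrete, here is a small sketch comparing the sigmoid with its degree-1 Taylor expansion around zero, $\sigma(x) \approx \tfrac{1}{2} + \tfrac{x}{4}$; the paper may use a different fitted degree-1 polynomial, so the coefficients below are an assumption for illustration:

```python
import math

def sigmoid(x):
    """The logistic sigmoid, 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_deg1(x):
    """Degree-1 Taylor approximation of the sigmoid at x = 0:
    sigma(0) = 1/2, sigma'(0) = 1/4."""
    return 0.5 + 0.25 * x

# The approximation is accurate near 0 but degrades as |x| grows,
# and it is unbounded, unlike the true sigmoid.
for x in (0.0, 0.5, 2.0, 4.0):
    print(x, abs(sigmoid_deg1(x) - sigmoid(x)))
```

The error is negligible for small inputs (about 0.0025 at x = 0.5) but large once the model's logits leave the near-linear regime (over 0.5 at x = 4), which is the kind of limitation the meta-review asks the final version to make explicit.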